# Multimodal Feature Extraction

InternViT-6B-448px-V1-0

InternViT-6B-448px-V1-0 is a vision foundation model for image feature extraction. It accepts 448x448 input images and offers enhanced OCR capabilities and improved support for Chinese dialogue.
